
    Adaptive Scattering Transforms for Playing Technique Recognition

    Playing techniques contain distinctive information about musical expressivity and interpretation. Yet, current research in music signal analysis suffers from a scarcity of computational models for playing techniques, especially in the context of live performance. To address this problem, our paper develops a general framework for playing technique recognition. We propose the adaptive scattering transform, which refers to any scattering transform that includes a stage of data-driven dimensionality reduction over at least one of its wavelet variables, for representing playing techniques. Two adaptive scattering features are presented: frequency-adaptive scattering and direction-adaptive scattering. We analyse seven playing techniques: vibrato, tremolo, trill, flutter-tongue, acciaccatura, portamento, and glissando. To evaluate the proposed methodology, we create a new dataset containing full-length Chinese bamboo flute performances (CBFdataset) with expert playing technique annotations. Once trained on the proposed scattering representations, a support vector classifier achieves state-of-the-art results. We provide explanatory visualisations of scattering coefficients for each technique and verify the system over three additional datasets with various instrumental and vocal techniques: VPset, SOL, and VocalSet.
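
    A minimal sketch of the general idea, assuming Kymatio and scikit-learn are available: scattering coefficients are computed per excerpt, a data-driven dimensionality reduction (plain PCA here, standing in for the paper's frequency- or direction-adaptive reduction) is learned over the scattering paths, and a support vector classifier is trained on the result. The excerpts, labels, and parameters below are placeholders, not the authors' pipeline.
```python
# Sketch: scattering front-end + data-driven dimensionality reduction + SVM.
import numpy as np
from kymatio.numpy import Scattering1D
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

T = 2 ** 14                                   # analysis window length in samples
scattering = Scattering1D(J=8, shape=T, Q=8)  # octaves, input length, filters per octave

def scatter_features(x):
    """Time-averaged scattering coefficients for one audio excerpt."""
    S = scattering(x)                         # shape: (n_scattering_paths, n_frames)
    return S.mean(axis=-1)                    # pool over time

# Hypothetical excerpts and playing-technique labels (random placeholders).
X_audio = np.random.randn(8, T)
y = np.array([0, 1, 0, 1, 0, 1, 0, 1])

X = np.stack([scatter_features(x) for x in X_audio])
clf = make_pipeline(StandardScaler(), PCA(n_components=4), SVC(kernel="rbf"))
clf.fit(X, y)
```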

    Playing Technique Recognition by Joint Time–Frequency Scattering

    Playing techniques are important expressive elements in music signals. In this paper, we propose a recognition system based on the joint time–frequency scattering transform (jTFST) for pitch evolution-based playing techniques (PETs), a group of playing techniques with monotonic pitch changes over time. The jTFST represents spectro-temporal patterns in the time–frequency domain, capturing discriminative information of PETs. As a case study, we analyse three commonly used PETs of the Chinese bamboo flute: acciaccatura, portamento, and glissando, and encode their characteristics using the jTFST. To verify the proposed approach, we create a new dataset, the CBF-petsDB, containing PETs played in isolation as well as in the context of whole pieces performed and annotated by professional players. Feeding the jTFST to a machine learning classifier, we obtain F-measures of 71% for acciaccatura, 59% for portamento, and 83% for glissando detection, and provide explanatory visualisations of scattering coefficients for each technique.
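
    The jTFST itself is defined in the paper; as a rough, hedged stand-in for the directional sensitivity it provides, the 2D Fourier transform of a spectrogram patch separates rising from falling pitch glides into different diagonal quadrant pairs. The synthetic glide and the analysis parameters below are illustrative only.
```python
# Stand-in for jTFST directionality: compare diagonal quadrant-pair energies
# of the 2D FFT of a (log-)spectrogram; rising and falling glides differ.
import numpy as np
from scipy.signal import spectrogram

sr = 22050
t = np.linspace(0, 2.0, 2 * sr, endpoint=False)
glide_up = np.sin(2 * np.pi * (300 * t + 400 * t ** 2))   # upward glissando-like chirp

def modulation_quadrants(x, sr):
    """Energies of the two diagonal quadrant pairs of the spectrogram's 2D FFT."""
    _, _, S = spectrogram(x, fs=sr, nperseg=512, noverlap=384)
    P = np.log1p(S)
    M = np.abs(np.fft.fftshift(np.fft.fft2(P - P.mean())))
    h, w = M.shape
    diag = M[: h // 2, : w // 2].sum() + M[h // 2 :, w // 2 :].sum()
    anti = M[: h // 2, w // 2 :].sum() + M[h // 2 :, : w // 2].sum()
    return diag, anti

print(modulation_quadrants(glide_up, sr))          # one quadrant pair dominates
print(modulation_quadrants(glide_up[::-1], sr))    # time-reversed glide: the other pair
```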

    Adaptive Time–Frequency Scattering for Periodic Modulation Recognition in Music Signals

    Vibratos, tremolos, trills, and flutter-tongue are techniques frequently found in vocal and instrumental music. A common feature of these techniques is the periodic modulation in the time–frequency domain. We propose a representation based on time–frequency scattering to model the inter-class variability for fine discrimination of these periodic modulations. Time–frequency scattering is an instance of the scattering transform, an approach for building invariant, stable, and informative signal representations. The proposed representation is calculated around the wavelet subband of maximal acoustic energy, rather than over all the wavelet bands. To demonstrate the feasibility of this approach, we build a system that computes the representation as input to a machine learning classifier. Whereas previously published datasets for playing technique analysis focus primarily on techniques recorded in isolation, for ecological validity, we create a new dataset to evaluate the system. The dataset, named CBF-periDB, contains full-length expert performances on the Chinese bamboo flute that have been thoroughly annotated by the players themselves. We report F-measures of 99% for flutter-tongue, 82% for trill, 69% for vibrato, and 51% for tremolo detection, and provide explanatory visualisations of scattering coefficients for each of these techniques.
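
    A minimal sketch of the adaptive step described above, with a spectrogram standing in for the wavelet filterbank (an assumption, not the authors' feature): locate the subband of maximal energy and keep only a neighbourhood around it rather than all bands. The vibrato test signal and window sizes are illustrative.
```python
# Sketch: restrict the representation to a neighbourhood of the maximal-energy subband.
import numpy as np
from scipy.signal import spectrogram

def band_around_max_energy(x, sr, half_width=8):
    """Return the spectrogram rows centred on the subband of maximal energy."""
    _, _, S = spectrogram(x, fs=sr, nperseg=1024, noverlap=768)
    k_max = int(np.argmax(S.sum(axis=1)))            # subband with maximal energy
    lo = max(0, k_max - half_width)
    hi = min(S.shape[0], k_max + half_width + 1)
    return S[lo:hi, :], k_max

sr = 22050
t = np.linspace(0, 1.0, sr, endpoint=False)
vibrato = np.sin(2 * np.pi * 440 * t + 0.8 * np.sin(2 * np.pi * 6 * t))  # 6 Hz vibrato
patch, k_max = band_around_max_energy(vibrato, sr)
print(patch.shape, "centre bin:", k_max)
```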

    MtSNPscore: a combined evidence approach for assessing cumulative impact of mitochondrial variations in disease

    Human mitochondrial DNA (mtDNA) variations have been implicated in a broad spectrum of diseases. With over 3000 mtDNA variations reported across databases, establishing the pathogenicity of variations in mtDNA is a major challenge. We have designed and developed a comprehensive weighted scoring system (MtSNPscore) for identification of mtDNA variations that can impact pathogenicity and would likely be associated with disease. The criteria for pathogenicity include information available in the literature, predictions made by various in silico tools, and the frequency of variation in normal and patient datasets. The scoring scheme also assigns scores to patients and normal individuals to estimate the cumulative impact of variations. The method has been implemented in an automated pipeline and has been tested on an Indian ataxia dataset (92 individuals) sequenced in this study, and on a publicly available mtSNP dataset comprising 576 mitochondrial genomes of Japanese individuals from six different groups, namely patients with Parkinson's disease, patients with Alzheimer's disease, young obese males, young non-obese males, and type-2 diabetes patients with or without severe vascular involvement. For analysis, MtSNPscore can extract information from variation data or from mitochondrial DNA sequences. It has a web interface at http://bioinformatics.ccmb.res.in/cgi-bin/snpscore/Mtsnpscore.pl that provides the flexibility to update/modify the parameters for estimating pathogenicity.
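
    A hypothetical sketch of a combined-evidence weighted score in this spirit; the weights, field names, and aggregation below are invented for illustration and do not reproduce MtSNPscore's actual scoring scheme.
```python
# Sketch: weighted combination of literature, in silico, and frequency evidence,
# summed per individual to estimate cumulative impact. All weights are invented.
def variant_score(literature_pathogenic, insilico_scores, freq_patients, freq_controls,
                  w_lit=2.0, w_insilico=1.0, w_freq=1.5):
    """Weighted sum of evidence for a single mtDNA variant."""
    lit_term = w_lit if literature_pathogenic else 0.0
    insilico_term = w_insilico * (sum(insilico_scores) / len(insilico_scores))
    # Variants enriched in patients relative to controls score higher.
    freq_term = w_freq * max(0.0, freq_patients - freq_controls)
    return lit_term + insilico_term + freq_term

def individual_score(variants):
    """Cumulative impact: sum of per-variant scores carried by one individual."""
    return sum(variant_score(**v) for v in variants)

example = [
    dict(literature_pathogenic=True,  insilico_scores=[0.9, 0.7], freq_patients=0.10, freq_controls=0.01),
    dict(literature_pathogenic=False, insilico_scores=[0.2, 0.3], freq_patients=0.05, freq_controls=0.04),
]
print(individual_score(example))
```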

    Capacity Analysis of MIMO-WLAN Systems with Single Co-Channel Interference

    In this paper, the channel capacity of multiple-input multiple-output wireless local area network (MIMO-WLAN) systems with single co-channel interference (CCI) is calculated. A ray-tracing approach is used to calculate the channel frequency response, which is in turn used to calculate the corresponding channel capacity. The abilities of the MIMO-WLAN simple uniform linear array (ULA) and polarization diversity array (PDA) to combat CCI are investigated, and the effects caused by the antenna arrays of the desired system and of the CCI are quantified. Numerical results show that MIMO-PDA performs better than MIMO-ULA when interference is present.
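
    A worked sketch of the capacity computation using the standard formula for MIMO capacity with a single co-channel interferer treated as coloured noise; the i.i.d. random channel matrices below stand in for the ray-traced channel frequency responses, and the power levels are illustrative.
```python
# Sketch: C = log2 det( I + R^{-1} H Q H^H ), with R = I + (P_i/Nt_i) Hi Hi^H
# (noise-plus-interference covariance, unit noise power) and Q = (P/Nt) I
# (equal power allocation). Not the paper's ray-tracing setup.
import numpy as np

def mimo_capacity_with_cci(H, Hi, snr_db, inr_db):
    nr, nt = H.shape
    P = 10 ** (snr_db / 10)                                   # desired-signal SNR
    Pi = 10 ** (inr_db / 10)                                  # interference-to-noise ratio
    R = np.eye(nr) + (Pi / Hi.shape[1]) * Hi @ Hi.conj().T    # noise + CCI covariance
    A = np.eye(nr) + np.linalg.inv(R) @ ((P / nt) * H @ H.conj().T)
    _, logdet = np.linalg.slogdet(A)
    return logdet / np.log(2)                                 # bits/s/Hz

rng = np.random.default_rng(0)
H  = (rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))) / np.sqrt(2)
Hi = (rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))) / np.sqrt(2)
print(mimo_capacity_with_cci(H, Hi, snr_db=20, inr_db=10))
```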

    Kymatio: Deep Learning meets Wavelet Theory for Music Signal Processing

    We present a tutorial on MIR with the open-source Kymatio (Andreux et al., 2020) toolkit for analysis and synthesis of music signals and timbre with differentiable computing. Kymatio is a Python package for applications at the intersection of deep learning and wavelet scattering. Its latest release (v0.4) provides an implementation of the joint time–frequency scattering transform (JTFS), an idealisation of a neurophysiological model well known in musical timbre perception research: the spectrotemporal receptive field (STRF) (Patil et al., 2012). In MIR research, scattering transforms have demonstrated effectiveness in musical instrument classification (Vahidi et al., 2022), neural audio synthesis (Andreux et al., 2018), playing technique recognition and similarity (Lostanlen et al., 2021), acoustic modelling (Lostanlen et al., 2020), and synthesizer parameter estimation and objective audio similarity (Vahidi et al., 2023; Lostanlen et al., 2023). The Kymatio ecosystem will be introduced with examples in MIR:
    - Wavelet transform and scattering introduction (including constant-Q transform, scattering transforms, joint time–frequency scattering transforms, and visualizations)
    - MIR with scattering: music classification
    - A perceptual distance objective for gradient descent
    - Generative evaluation of audio representations (GEAR) (Lostanlen et al., 2023)
    A comprehensive overview of Kymatio's frontend user interface will be given, with examples of extensibility of the core routines and filterbank construction. We ask our participants to have some prior knowledge in:
    - Python and NumPy programming (familiarity with PyTorch is a bonus, but not essential)
    - Spectrogram visualization
    - Computer-generated sounds
    No prior knowledge of wavelet or scattering transforms is expected.
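
    A minimal usage sketch, assuming a recent Kymatio release: a 1D scattering transform is built, applied to a placeholder signal, and its per-path metadata inspected. The JTFS front-end added in v0.4 follows the same construct-then-call pattern; the signal and parameters here are illustrative.
```python
# Sketch of Kymatio's Scattering1D front-end (numpy backend).
import numpy as np
from kymatio.numpy import Scattering1D

T = 2 ** 14                                   # signal length in samples
x = np.random.randn(T)                        # stand-in for an audio excerpt

scattering = Scattering1D(J=8, shape=T, Q=8)  # octaves, input length, filters per octave
Sx = scattering(x)                            # (n_scattering_paths, n_time_frames)

meta = scattering.meta()                      # per-path metadata (order, centre frequencies)
print(Sx.shape, np.unique(meta["order"]))     # orders 0, 1, 2
```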

    Mesostructures: Beyond Spectrogram Loss in Differentiable Time–Frequency Analysis

    Computer musicians refer to mesostructures as the intermediate levels of articulation between the microstructure of waveshapes and the macrostructure of musical forms. Examples of mesostructures include melody, arpeggios, syncopation, polyphonic grouping, and textural contrast. Despite their central role in musical expression, they have received limited attention in recent applications of deep learning to the analysis and synthesis of musical audio. Currently, autoencoders and neural audio synthesizers are only trained and evaluated at the scale of microstructure: i.e., local amplitude variations up to 100 milliseconds or so. In this paper, we formulate and address the problem of mesostructural audio modeling via a composition of a differentiable arpeggiator and time–frequency scattering. We empirically demonstrate that time–frequency scattering serves as a differentiable model of similarity between synthesis parameters that govern mesostructure. By exposing the sensitivity of short-time spectral distances to time alignment, we motivate the need for a time-invariant and multiscale differentiable time–frequency model of similarity at the level of both local spectra and spectrotemporal modulations.
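
    A small numerical illustration of the alignment-sensitivity argument, under assumptions not taken from the paper: a frame-wise spectrogram distance between a signal and a slightly time-shifted copy is compared with a time-averaged spectral distance, which is far less affected by the shift.
```python
# Sketch: short-time spectral distances are sensitive to small time shifts,
# whereas a time-averaged (shift-tolerant) feature barely changes.
import numpy as np
from scipy.signal import spectrogram

sr = 22050
t = np.linspace(0, 1.0, sr, endpoint=False)
x = np.sin(2 * np.pi * 440 * t) * np.exp(-3 * t)        # decaying tone
x_shift = np.roll(x, 256)                               # ~12 ms shift

def spec(x):
    _, _, S = spectrogram(x, fs=sr, nperseg=512, noverlap=256)
    return np.log1p(S)

S1, S2 = spec(x), spec(x_shift)
framewise = np.linalg.norm(S1 - S2)                           # alignment-sensitive
averaged = np.linalg.norm(S1.mean(axis=1) - S2.mean(axis=1))  # roughly shift-invariant
print(framewise, averaged)
```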